
    Estimating the Area under a Receiver Operating Characteristic Curve For Repeated Measures Design

    The receiver operating characteristic (ROC) curve is widely used for diagnosis as well as for judging the discrimination ability of different statistical models. Although the theory of ROC curves is well established, and computational methods and software are available for cross-sectional designs, little research has been done on estimating ROC curves and their summary statistics for repeated measures designs, which are useful in many applications, such as biological, medical and health services research. Furthermore, no published statistical software can generate ROC curves and calculate summary statistics of the area under a ROC curve for data from a repeated measures design. Using a generalized linear mixed model (GLMM), we estimate the predicted probability of the positivity of a disease or condition, and the estimated probability is then used as a biomarker for constructing the ROC curve and computing the area under the curve. The area under a ROC curve is calculated with the Wilcoxon non-parametric approach by comparing the predicted probabilities of all discordant pairs of observations. The ROC curve is constructed by plotting a series of pairs of true positive rate (sensitivity) and false positive rate (1 − specificity) calculated at positivity cutoffs escalated in increments of 0.005 in predicted probability. The computation software is written in SAS/IML/MACRO v8 and can be run on any computer with a working SAS v8 system that includes SAS/IML/MACRO.
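    The two computations the abstract describes (the Wilcoxon pairwise-comparison AUC and the ROC curve traced at cutoffs stepped by 0.005) can be sketched as follows. This is a minimal Python illustration of those standard calculations, not the authors' SAS/IML/MACRO software; `y_true` holds 0/1 outcomes and `p_hat` the GLMM-predicted probabilities.

```python
import numpy as np

def auc_wilcoxon(y_true, p_hat):
    """AUC as the fraction of (positive, negative) observation pairs in which
    the positive one gets the higher predicted probability; ties count 0.5
    (the Wilcoxon/Mann-Whitney statistic)."""
    pos = p_hat[y_true == 1]
    neg = p_hat[y_true == 0]
    # compare every positive prediction against every negative one
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def roc_points(y_true, p_hat, step=0.005):
    """(FPR, TPR) pairs at positivity cutoffs escalated by `step`,
    mirroring the 0.005 increments described in the abstract."""
    pts = []
    for c in np.arange(0.0, 1.0 + step, step):
        pred = p_hat >= c
        tpr = (pred & (y_true == 1)).sum() / max((y_true == 1).sum(), 1)
        fpr = (pred & (y_true == 0)).sum() / max((y_true == 0).sum(), 1)
        pts.append((fpr, tpr))
    return pts
```

    For perfectly separated predictions the pairwise count gives AUC = 1; exact ties between a positive and a negative contribute 0.5 each, which is what makes this estimate equivalent to the trapezoidal area under the empirical ROC curve.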

    Sample Size Calculation and Power Analysis of Time-Averaged Difference

    Little research has been done on sample size and power analysis under repeated measures designs. With detailed derivation, we present sample size calculation and power analysis equations for the time-averaged difference that allow unequal sample sizes between two groups, for both continuous and binary measures, and we explore through simulation the relative importance of the number of unique subjects and the number of repeated measurements within each subject on statistical power.
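    The abstract does not reproduce its equations, but the widely used closed form for the continuous case under compound symmetry illustrates the trade-off it studies: with m repeated measures per subject, within-subject correlation rho, and group sizes n1 and n2 = k*n1, the required n1 scales with (1 + (m-1)*rho)/m. The sketch below is that textbook formula, not necessarily the authors' derivation; the function name and interface are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_time_averaged(delta, sigma, m, rho, k=1.0, alpha=0.05, power=0.80):
    """Per-group sample sizes (n1, n2 = k*n1) to detect a time-averaged mean
    difference `delta` between two groups, with `m` repeated measures per
    subject and compound-symmetric within-subject correlation `rho`.
    Standard normal-approximation formula; sketch only."""
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)          # two-sided type-I error
    zb = z.inv_cdf(power)                  # target power
    design_effect = 1 + (m - 1) * rho      # inflation from correlated repeats
    n1 = (1 + 1 / k) * (za + zb) ** 2 * sigma ** 2 * design_effect / (m * delta ** 2)
    return ceil(n1), ceil(k * ceil(n1))
```

    With m = 1 and rho = 0 this reduces to the usual two-sample formula; as rho grows, adding repeated measurements within a subject buys progressively less power, which is the relative-importance question the simulation study addresses.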

    Learn from Yesterday: A Semi-Supervised Continual Learning Method for Supervision-Limited Text-to-SQL Task Streams

    Conventional text-to-SQL studies are limited to a single task with fixed-size training and test sets. When confronted with a stream of tasks, common in real-world applications, existing methods struggle with insufficient supervised data and high retraining costs. The former tends to cause overfitting on unseen databases for the new task, while the latter makes a full review of instances from past tasks impractical, resulting in forgetting of learned SQL structures and database schemas. To address these problems, this paper proposes integrating semi-supervised learning (SSL) and continual learning (CL) in a stream of text-to-SQL tasks and offers two promising solutions in turn. The first solution, Vanilla, performs self-training, augmenting the supervised training data with predicted pseudo-labeled instances of the current task, while replacing full-volume retraining with episodic memory replay to balance training efficiency against performance on previous tasks. The improved solution, SFNet, takes advantage of the intrinsic connection between CL and SSL: it uses in-memory past information to help current SSL, while adding high-quality pseudo instances to memory to improve future replay. Experiments on two datasets show that SFNet outperforms widely used SSL-only and CL-only baselines on multiple metrics. Comment: Accepted by AAAI-202
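    The Vanilla loop described above (self-training on confident pseudo-labels plus bounded episodic memory replay instead of full retraining) can be sketched generically. This is an assumed interface for illustration, not the paper's code: `train`, `predict`, and `confidence` are user-supplied callables, and the memory policy here is a simple bounded buffer of labeled exemplars.

```python
def vanilla_ssl_cl(tasks, train, predict, confidence, memory_size=50, tau=0.9):
    """Sketch of a self-training + episodic-replay loop over a task stream.
    `tasks` yields (labeled, unlabeled) pairs; `labeled` is a list of
    (input, label) tuples. Hypothetical interface, for illustration only."""
    memory = []
    model = None
    for labeled, unlabeled in tasks:
        # self-training: pseudo-label confident unlabeled instances
        pseudo = []
        if model is not None:
            for x in unlabeled:
                if confidence(model, x) >= tau:
                    pseudo.append((x, predict(model, x)))
        # replay a bounded memory instead of retraining on all past tasks
        model = train(labeled + pseudo + memory)
        # keep a few labeled exemplars per task, capped at memory_size
        memory = (memory + labeled[: memory_size // max(len(tasks), 1)])[-memory_size:]
    return model, memory
```

    Replaying the small memory alongside each new task's data is what counters forgetting of past SQL structures and schemas; SFNet's refinement, per the abstract, is to also curate high-quality pseudo instances into that memory.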

    A method for analyzing censored survival phenotype with gene expression data

    Background: Survival time is an important clinical trait in many disease studies. Previous work has shown a relationship between patients' gene expression profiles and survival time. However, because survival times are censored and gene expression data are high-dimensional, effective and unbiased selection of a gene expression signature to predict survival probabilities requires further study. Method: We propose a method for an integrated study of survival time and gene expression, summarized as a two-step procedure: in the first step, a moderate number of genes are pre-selected using correlation or liquid association (LA), with imputation and transformation methods employed for the correlation/LA calculation; in the second step, the dimension of the predictors is further reduced using sliced inverse regression modified for censored data (censorSIR). Results: The new method is tested on both simulated and real data. For the real-data application, we analyzed a set of 295 breast cancer patients and found a linear combination of 22 gene expression profiles that is significantly correlated with patients' survival rate. Conclusion: By an appropriate combination of feature selection and dimension reduction, we obtain a method of identifying gene expression signatures that is effective for survival prediction.
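    The first step of the two-step procedure, pre-selecting a moderate number of genes by correlation with survival time, can be sketched as below. This is a simplified illustration only: it assumes survival times have already been imputed/transformed to handle censoring (which the paper does upstream), and it uses plain Pearson correlation rather than liquid association.

```python
import numpy as np

def preselect_genes(expr, time, k=200):
    """Rank genes by absolute Pearson correlation with (imputed, transformed)
    survival time and return the indices of the top `k`.
    `expr` is an (n_samples, n_genes) matrix; `time` has length n_samples.
    Sketch of step 1 only; censoring is assumed handled beforehand."""
    t = (time - time.mean()) / time.std()
    x = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    corr = (x * t[:, None]).mean(axis=0)   # per-gene correlation with time
    return np.argsort(-np.abs(corr))[:k]
```

    The pre-selected genes would then feed the second step (censorSIR), which searches for a low-dimensional linear combination of the expression profiles, such as the 22-gene combination reported for the breast cancer cohort.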

    Learning Effective NeRFs and SDFs Representations with 3D Generative Adversarial Networks for 3D Object Generation: Technical Report for ICCV 2023 OmniObject3D Challenge

    In this technical report, we present a solution for the 3D object generation track of the ICCV 2023 OmniObject3D Challenge. In recent years, 3D object generation has made great progress and achieved promising results, but it remains a challenging task due to the difficulty of generating complex, textured and high-fidelity results. To address this problem, we study learning effective NeRF and SDF representations with 3D Generative Adversarial Networks (GANs) for 3D object generation. Specifically, inspired by recent work, we use an efficient geometry-aware 3D GAN as the backbone, incorporating label embedding and color mapping, which enables the model to be trained on different taxonomies simultaneously. Then, through a decoder, we aggregate the resulting features to generate Neural Radiance Field (NeRF) based representations for rendering high-fidelity synthetic images. Meanwhile, we optimize Signed Distance Functions (SDFs) to effectively represent objects with 3D meshes. Moreover, we observe that this model can be trained effectively with only a few images of each object from a variety of classes, instead of using a great number of images per object or training one model per class. With this pipeline, we can optimize an effective model for 3D object generation. This solution is one of the final top-three solutions in the ICCV 2023 OmniObject3D Challenge.

    Switching of easy-axis to easy-plane anisotropy in cobalt(ii) complexes

    A tetranuclear cubane-type complex [Co4(ntfa)4(CH3O)4(CH3OH)4] (1) with a {Co4O4} core and a mononuclear complex [Co(ntfa)2(CH3OH)2] (2) have been rationally obtained by adjusting the ratio of the β-diketonate and Co(II) ions, with the synthetic processes monitored by in situ microcalorimetry. Then, following the synthetic conditions used to obtain 2, but with three distinct N-donor coligands - 2,2'-bipyridyl (bpy), 6,6'-dimethyl-2,2'-bipyridyl (6,6-(CH3)2-bpy) and 5,5'-dimethyl-2,2'-bipyridyl (5,5-(CH3)2-bpy) - three novel mononuclear complexes have been obtained: [Co(ntfa)2(bpy)2] (3), [Co(ntfa)2(6,6-(CH3)2-bpy)2] (4) and [Co(ntfa)2(5,5-(CH3)2-bpy)2] (5). The introduction of different capping coligands - as single-crystal X-ray crystallography ascertains - fine-tunes the structures, changing both the degree of distortion of the coordination geometry and the intermolecular interactions, which have a direct impact on the magnetic properties of these complexes. Magnetic investigations reveal field-induced single-ion magnet behavior in all complexes, with distinct energy barriers (Ueff) of 39.06 (1), 36.65 (2), 36.32 (3), 28.26 (4) and 15.85 K (5). Magnetic experiments together with HF-EPR measurements and theoretical calculations demonstrate that 2 features easy-axis magnetic anisotropy (D = −60.48 cm−1), whereas 3-5 show easy-plane magnetic anisotropies (D = +70.77 cm−1 for 3, +35.71 cm−1 for 4, and +51.28 cm−1 for 5). To our knowledge, such a reversal of the nature of the anisotropy driven by coligands is unprecedented.